Minimal Test Collections for Relevance Feedback

Authors

  • Ben Carterette
  • Praveen Chandar
  • Aparna Kailasam
  • Divya Muppaneni
  • Sree Lekha Thota
Abstract

The Information Retrieval Lab at the University of Delaware participated in the Relevance Feedback track at TREC 2009. We used only the Category B subset of the ClueWeb collection; our preprocessing and indexing steps are described in our paper on ad hoc and diversity runs [10]. The second year of the Relevance Feedback track focused on the selection of documents for feedback. Our hypothesis is that documents that are good at distinguishing systems by their mean average precision will also be good documents for relevance feedback. We therefore applied the MTC (Minimal Test Collections) document selection algorithm developed by Carterette et al. [6, 4, 9, 5] and used in the Million Query track [2, 1, 8] to select documents for judging so as to rank systems correctly. Our approach can thus be described as “MTC for Relevance Feedback”.
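The exact algorithm is developed in [6, 4, 9, 5]; as a rough illustration of the selection idea only, the sketch below greedily picks the unjudged documents over which a set of runs disagrees most. It substitutes reciprocal rank for the AP-based document weights that MTC actually derives, so the function names and the weighting scheme are illustrative assumptions, not the track implementation.

```python
from itertools import combinations

def mtc_style_selection(runs, judged, k):
    """Greedy MTC-style document selection (simplified sketch).

    runs   -- dict: run name -> ranked list of doc ids
    judged -- set of doc ids that already have judgments
    k      -- number of documents to select

    Real MTC weights a document by how much judging it can change the
    bounds on each run pair's difference in average precision; here
    reciprocal rank stands in for that weight, keeping the intuition
    that highly ranked, disagreed-upon documents are the most
    informative ones to judge.
    """
    def weight(ranking, doc):
        # reciprocal rank; 0 if the run did not retrieve the doc
        return 1.0 / (ranking.index(doc) + 1) if doc in ranking else 0.0

    pool = {d for ranking in runs.values() for d in ranking} - judged
    selected = []
    for _ in range(min(k, len(pool))):
        # pick the document whose judgment would most separate run pairs
        best = max(pool, key=lambda d: sum(
            abs(weight(runs[a], d) - weight(runs[b], d))
            for a, b in combinations(runs, 2)))
        selected.append(best)
        pool.remove(best)
    return selected

# Two toy runs that disagree most about d3
runs = {"runA": ["d1", "d2", "d3"], "runB": ["d3", "d1", "d2"]}
print(mtc_style_selection(runs, judged=set(), k=2))  # ['d3', 'd1']
```

Under the hypothesis above, the documents such a procedure surfaces would then be judged and fed back into the retrieval model, rather than used only for evaluation.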

Similar Resources

Parsimonious Relevance Models for Multiple Corpora

We describe a method for applying parsimonious language models to re-estimate the term probabilities assigned by relevance models. We apply our method to six topic sets from test collections in five different genres. Our parsimonious relevance models (i) improve retrieval effectiveness in terms of MAP on all collections, (ii) significantly outperform their non-parsimonious counterparts on most ...
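As a sketch of what parsimonization does, the function below applies the standard parsimonious language-model EM re-estimation (in the style of Hiemstra et al.) to a relevance model's term distribution against a background collection model. The parameter names and settings are assumptions for illustration; the paper's exact estimation details may differ.

```python
def parsimonize(term_probs, background, lam=0.1, iters=10, threshold=1e-4):
    """EM re-estimation that concentrates a term distribution on the
    terms the background (collection) model explains poorly.

    term_probs -- dict: term -> probability under the relevance model
    background -- dict: term -> probability under the collection model
    lam        -- mixture weight of the specific (parsimonious) model
    """
    p = dict(term_probs)  # initialize the specific model
    for _ in range(iters):
        # E-step: portion of each term's mass attributed to the
        # specific model rather than to the background model
        e = {t: term_probs[t] * (lam * p[t]) /
                (lam * p[t] + (1 - lam) * background.get(t, 1e-9))
             for t in p}
        # M-step: renormalize, pruning terms whose mass collapses
        total = sum(e.values())
        p = {t: v / total for t, v in e.items() if v / total > threshold}
    return p

# Common terms lose mass to the background; distinctive terms keep it
rm = {"jaguar": 0.3, "car": 0.4, "the": 0.3}
bg = {"jaguar": 0.0001, "car": 0.01, "the": 0.2}
print(parsimonize(rm, bg))  # 'the' is driven out; 'jaguar'/'car' gain
```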

Toshiba BRIDJE at NTCIR-6 CLIR: The Head/Lead Method and Graded Relevance Feedback

At NTCIR-6 CLIR, Toshiba participated in the Monolingual and Bilingual IR tasks covering three topic languages (Japanese, English and Chinese) and one document language (Japanese). For Stage 1 (which is the usual ad hoc task using the new NTCIR-6 topics), we submitted two DESCRIPTION runs and two TITLE runs for each topic language. Our first search strategy is Selective Sampling with Memory Rese...

Flexible Pseudo-Relevance Feedback via Direct Mapping and Categorization of Search Requests

This paper explores various strategies for enhancing the reliability of pseudo-relevance feedback using TREC and NTCIR test collections. For each test request, the number of pseudo-relevant documents or the number of expansion terms is determined based on a similar training request (i.e., via direct mapping) or a group of similar training requests (i.e., via categorization). The results ...
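The direct-mapping idea can be pictured with a small sketch: represent each request as a term vector, find the most similar training request, and reuse the feedback parameters tuned for it (the categorization variant would instead pool a group of similar training requests). The representation, similarity measure, and parameter tuple below are assumptions for illustration, not the paper's exact method.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * \
          math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def feedback_params(test_request, training):
    """Copy the PRF parameters of the most similar training request.

    test_request -- dict: term -> frequency for the new request
    training     -- list of (request_vector, (n_docs, n_terms)) pairs,
                    where the tuple holds that request's tuned number of
                    pseudo-relevant documents and expansion terms
    """
    _, best_params = max(
        training, key=lambda pair: cosine(test_request, pair[0]))
    return best_params

training = [({"web": 2, "search": 1}, (10, 30)),
            ({"gene": 1, "protein": 2}, (5, 10))]
print(feedback_params({"web": 1, "ranking": 1}, training))  # (10, 30)
```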

On the Evaluation of the Quality of Relevance Assessments Collected through Crowdsourcing

Established methods for evaluating information retrieval systems rely upon test collections that comprise document corpora, search topics, and relevance assessments. Building large test collections is, however, an expensive and increasingly challenging process. In particular, building a collection with a sufficient quantity and quality of relevance assessments is a major challenge. With the gro...


Journal:

Volume   Issue

Pages  -

Publication date: 2009